-
Notifications
You must be signed in to change notification settings - Fork 599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cdc): amortize snapshot read to multiple barriers #16349
Conversation
Can you describe the background/motivation? |
To implement this RFC we discussed before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -69,6 +71,11 @@ pub struct CdcBackfillExecutor<S: StateStore> { | |||
rate_limit_rps: Option<u32>, | |||
|
|||
disable_backfill: bool, | |||
|
|||
// TODO: make these options configurable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be done in #16426
@@ -98,6 +98,8 @@ impl ExecutorBuilder for StreamCdcScanExecutorBuilder { | |||
state_table, | |||
node.rate_limit, | |||
disable_backfill, | |||
1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snapshot_interval
will default to 1
, and we allow user to config different value in WITH clause #16426
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM
Evaluated, the barrier doesn't pile up with interval = 5 barriers. Right image is the baseline (1.8.1-rc). |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
We ensure to poll the snapshot stream at least once in each epoch previously, which may cause barrier pile up in some scenario (#15812)
This PR implement the RFC to allow reconstruct a new snapshot stream in each
snapshot_interval
barriers, trying to mitigate the barrier pile up problem.snapshot_interval
to configure the barrier interval to start a new snapshotsnapshot_read_full_table
to read all historical data in batches to simplify main logicfollow-up: #16426
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.